conformity score
Debiased Machine Learning for Conformal Prediction of Counterfactual Outcomes Under Runtime Confounding
Barnatchez, Keith, Josey, Kevin P., Nethery, Rachel C., Parmigiani, Giovanni
Data-driven decision making frequently relies on predicting counterfactual outcomes. In practice, researchers commonly train counterfactual prediction models on a source dataset to inform decisions on a possibly separate target population. Conformal prediction has arisen as a popular method for producing assumption-lean prediction intervals for counterfactual outcomes that would arise under different treatment decisions in the target population of interest. However, existing methods require that every confounding factor of the treatment-outcome relationship used for training on the source data is additionally measured in the target population, risking miscoverage if important confounders are unmeasured in the target population. In this paper, we introduce a computationally efficient debiased machine learning framework that allows for valid prediction intervals when only a subset of confounders is measured in the target population, a common challenge referred to as runtime confounding. Grounded in semiparametric efficiency theory, we show the resulting prediction intervals achieve desired coverage rates with faster convergence compared to standard methods. Through numerous synthetic and semi-synthetic experiments, we demonstrate the utility of our proposed method.
31b3b31a1c2f8a370206f111127c0dbd-Paper.pdf
This framework can accommodate almost any choice of conformity scores, and in fact many different implementations have already been proposed to address our problem. However, it remains unclear how to implement a concrete method from this broad family that can lead to the most informative possible prediction intervals.
The first method, explained in Section A1.4.1, consists of directly calibrating a sequence of nested two-sided intervals, as outlined in Section 3.3. The second method, explained in Section A1.4.2, consists of separately calibrating two sequences of lower and upper one-sided confidence intervals, each adopting the significance level $\alpha/2$ instead of $\alpha$. [...] $\sum_{j=l}^{u} \hat{\phi}_j(x)$ among the feasible ones with minimal $|u - l|$, whenever the optimization problem does not have a unique solution. Therefore, we can assume without loss of generality that (1) has a unique solution; if that is not the case, we can break the ties at random by adding a little noise to $\hat{\phi}$. For any integer $T \geq 1$, consider an increasing sequence $t_\tau \in [0,1]$, for $\tau \in \{0,\ldots,T\}$. A nested sequence of $T$ intervals indexed by $\tau \in \{0,\ldots,T\}$, which may be written in the form $S_t = [\hat{L}_{m,\alpha}(X_{m+1}; t_\tau), \hat{U}_{m,\alpha}(X_{m+1}; t_\tau)]$, for appropriate lower and upper endpoints $\hat{L}_{m,\alpha}(X_{m+1}; t_\tau)$ and $\hat{U}_{m,\alpha}(X_{m+1}; t_\tau)$, respectively, is then constructed from (1) as follows.
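The excerpt stops before the construction itself. As a separate illustration of the second strategy mentioned above (calibrating lower and upper one-sided bounds separately, each at level $\alpha/2$), the following is a minimal split-conformal sketch in plain Python. It is not the authors' procedure: the function names, the heuristic initial bounds `x - 1` and `x + 1`, and the simulated data are all assumptions made for illustration. Each bound is shifted by the usual finite-sample conformal quantile of its one-sided score, and a union bound then gives two-sided coverage of at least $1 - \alpha$ under exchangeability.

```python
import math
import random


def conformal_quantile(scores, level):
    """Return the ceil((n + 1) * level)-th smallest score, or +inf if that rank exceeds n."""
    n = len(scores)
    k = math.ceil((n + 1) * level)
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def calibrate_one_sided(lower_pred, upper_pred, y, alpha):
    """Calibrate the lower and upper bounds separately, each at level alpha / 2."""
    # Positive score = the observed outcome fell outside the corresponding bound.
    lo_scores = [lo - yi for lo, yi in zip(lower_pred, y)]
    hi_scores = [yi - hi for hi, yi in zip(upper_pred, y)]
    c_lo = conformal_quantile(lo_scores, 1 - alpha / 2)
    c_hi = conformal_quantile(hi_scores, 1 - alpha / 2)
    return c_lo, c_hi


def predict_interval(lower_pred, upper_pred, c_lo, c_hi):
    """Shift each initial bound outward (or inward) by its calibrated correction."""
    return lower_pred - c_lo, upper_pred + c_hi


def simulate_coverage(n_cal=1000, n_test=2000, alpha=0.1, seed=0):
    """Empirical coverage on simulated data (hypothetical model: y = x + N(0, 1))."""
    rng = random.Random(seed)

    def draw(n):
        xs = [rng.uniform(0, 1) for _ in range(n)]
        ys = [x + rng.gauss(0, 1) for x in xs]
        return xs, ys

    x_cal, y_cal = draw(n_cal)
    # Assumed heuristic initial bounds: x - 1 and x + 1.
    c_lo, c_hi = calibrate_one_sided(
        [x - 1 for x in x_cal], [x + 1 for x in x_cal], y_cal, alpha
    )
    x_te, y_te = draw(n_test)
    hits = 0
    for x, yv in zip(x_te, y_te):
        lo, hi = predict_interval(x - 1, x + 1, c_lo, c_hi)
        hits += lo <= yv <= hi
    return hits / n_test
```

Calling `simulate_coverage()` returns the fraction of simulated test outcomes that fall inside the calibrated intervals, which should sit near the nominal $1 - \alpha$ for this toy model. Calibrating the two sides separately lets the interval be asymmetric around the initial bounds, at the cost of the slightly conservative union bound.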